Getting Started#
This tutorial guides users through analyzing, visualizing, and generating synthetic birdsongs using the wavesongs Python package.
Note
This example shows the interactive capabilities of Matplotlib backends, which will not appear in the static documentation. Please run this code on your machine to see the interactivity.
If you are using Google Colab, uncomment and execute the following cell:
# ! git clone https://github.com/wavesongs/wavesongs
# ! pip install wavesongs
# # enable third-party widgets (ipympl)
# from google.colab import output
# output.enable_custom_widget_manager()
# # set the correct ProjDirs parameters
# audios = "/content/wavesongs/assets/audio"
# results = "/content/wavesongs/assets/audio/results"
Caution
If you encounter an error message, reset the kernel and run the above cell again. After that, everything should work successfully.
Libraries#
Import the required classes and functions from wavesongs:
# %matplotlib ipympl
from wavesongs.utils.paths import ProjDirs # Manages project files
from wavesongs.objs.syllable import Syllable # Syllable ws objects
from wavesongs.objs.song import Song # Song ws objects
from wavesongs import plot # Plotting utilities
from wavesongs.utils.tools import get_roi # Data extraction tools
Note
The first line changes the Matplotlib backend to ipympl, which enables interactive plots. For more information, visit the official Matplotlib documentation on interactive figures. If you are using a Python or IPython terminal, omit this line.
Project Directory#
To use objects like Syllable or Song, first define a project directory using a ProjDirs object. This object manages generated files such as images, audio files, and CSVs. The default project structure is:
project_root
└── assets
    ├── audios
    │   ├── <audio sample files>
    │   └── <meta data>.csv
    └── results
        ├── audios
        ├── figures
        └── mg_params
proj_dirs = ProjDirs(audios="./assets/audio", results="./assets/results")
You can list the audio files found with the find_audios method:
proj_dirs.find_audios()
['574179401 - Zonotrichia Capensis.wav',
'XC388622 - Ocellated Tapaculo - Acropternis orthonyx.mp3']
Song Object#
Next, create a Song instance with the project directory and a file name. In this tutorial, we use the audio file 574179401 - Zonotrichia Capensis.wav located at assets/audio/, but you can use your own recordings or download samples from Xeno Canto.
Once the song is defined, compute its acoustic features. You can specify parameters such as the Fast Fourier Transform (FFT) window and fundamental frequency (FF) method.
To display the spectrogram, use the plot module and the spectrogram_waveform function.
The name assigned to the song and syllables, “Copetón,” is the common name for Zonotrichia capensis in Colombia.
copeton_song = Song(proj_dirs, file_id="574179401")
copeton_song.acoustical_features(umbral_FF=1.4, NN=256)
# Display the song's spectrogram and waveform
plot.spectrogram_waveform(copeton_song, save=False)
copeton_song.play()
Tip
Long audio files can slow down plot response. A best practice is to define a region of interest (RoI), such as a complete birdsong.
Birdsong#
To select a specific time interval, set select_time=True and define the start and end of the RoI.
tlim_roi = (0.7, 2.7) # Time interval for the RoI
copeton_syllable = Song(proj_dirs, file_id="574179401", tlim=tlim_roi)
copeton_syllable.acoustical_features(umbral_FF=1.4, NN=256, ff_method="yin", flim=(1e2, 2e4))
klicker_times = plot.spectrogram_waveform(copeton_syllable, tlim=tlim_roi, select_time=True, save=False)
copeton_syllable.play()
This function returns an array of measurements, where each element is a pair containing the selected time and frequency ranges.
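If you selected more than one region, get_roi returns one pair per selection. A minimal sketch for inspecting all of them, assuming at least one region was drawn on the interactive plot above:
# Sketch: print every selected region returned by get_roi
# (each element unpacks into a time range and a frequency range)
for i, (t_range, f_range) in enumerate(get_roi(klicker_times)):
    print(f"RoI {i}: t = {t_range}, f = {f_range}")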
Warning
If no selection is made, an error will appear.
Note
The fundamental frequency is calculated using the YIN or PYIN algorithm. While these algorithms generally provide accurate results, the estimate over the initial portion of the signal may be less reliable.
data = get_roi(klicker_times)
tlim, flim = data[0] # index = 0: first pair
# Add the initial time from the syllable
t0 = tlim[0] + copeton_syllable.t0_bs
tend = tlim[1] + copeton_syllable.t0_bs
print(f"Region of Interest: (tini = {t0:.4f} s, tend = {tend:.4f} s)")
Region of Interest: (tini = 0.8781 s, tend = 1.3006 s)
Syllable#
Define a syllable object within the selected RoI:
# tlim = (0.8781, 1.3006) # For testing purpose
copeton_syllable_0 = Syllable(obj=copeton_syllable, tlim=tlim, type="intro-down", no_syllable="0")
copeton_syllable_0.acoustical_features(NN=256)
# Display the syllable's spectrogram and waveform
plot.spectrogram_waveform(copeton_syllable_0, ff_on=True, save=True)
Image save at ../../assets/results/figures/574179401-ZonotrichiaCapensis-0-intro-down
copeton_syllable_0.play()
# Write audio
copeton_syllable_0.write_audio()
Audio saved at ../../assets/results/audios/574179401-ZonotrichiaCapensis-syllable-0.wav.
Synthetic Syllable#
To generate a synthetic syllable, use the optimizer module, wavesongs.optimizer, which solves a minimization problem to find the optimal motor gesture coefficients that generate the most similar syllable.
Optimal Coefficients#
The birdsong model consists of four coefficients, \((a_0, b_0, b_1, b_2)\), that define the labial tension (\(\beta\)) and the air-sac pressure (\(\alpha\)). Although both variables change over time, we are only interested in the coefficients.
from wavesongs import optimizer
optimal_z = optimizer.optimal_params(syllable=copeton_syllable_0, Ns=10, full_output=True)
print(f"\nOptimal model values (alpha and beta coefficients):\n\t{optimal_z}")
Computing a0*...
Optimization terminated successfully.
Current function value: 0.013981
Iterations: 1
Function evaluations: 2
Optimal values: a_0=0.0010, t=0.76 min
Computing b0*, b1*, and b2*...
Optimization terminated successfully.
Current function value: 0.003349
Iterations: 37
Function evaluations: 75
Optimal values: b_0=-0.1973, b_2=1.2033, t=20.21 min
Optimization terminated successfully.
Current function value: 0.003350
Iterations: 13
Function evaluations: 27
Optimal values: b_1=0.9992, t=6.27 min
Time of execution: 27.24 min
Optimal model values (alpha and beta coefficients):
{'a0': 0.00105, 'b0': -0.19726, 'b1': 0.99924, 'b2': 1.20333}
Solution#
Now, generate and visualize the synthetic syllable using the computed optimal coefficients, optimal_z.
# optimal_z = {'a0': 0.00105, 'b0': -0.19726, 'b1': 0.99924, 'b2': 1.20333} # Testing purpose
synth_copeton_syllable_0 = copeton_syllable_0.solve(z=optimal_z, method="best")
plot.spectrogram_waveform(synth_copeton_syllable_0, ff_on=True, save=True)
Image save at ../../assets/results/figures/synth-574179401-ZonotrichiaCapensis-0-intro-down
synth_copeton_syllable_0.play()
# Write synthetic sample
synth_copeton_syllable_0.write_audio()
Audio saved at ../../assets/results/audios/synth-574179401-ZonotrichiaCapensis-synth-syllable-0.wav.
In the solve() method, there are two approaches available for generating synthetic songs:
best: The labial tension is calculated using the formula
\[ \beta(t) = b_0 + b_1 \cdot \tilde{FF} + b_2 \cdot \tilde{FF}^2 \]
where \(\tilde{FF}\) represents the fundamental frequency normalized by \(10^4\) Hz.
fast: The labial tension is computed as
\[ \beta(t) = b_0 + b_1 \cdot t + b_2 \cdot t^2 \]
where \(t\) denotes time. A usage sketch with this method is shown below.
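As a sketch, the same optimal coefficients can be reused with the fast approach; passing method="fast" is an assumption based on the method names above, analogous to the method="best" call shown earlier:
# Sketch: regenerate the synthetic syllable with the fast approach,
# where beta(t) is a polynomial in time rather than in the normalized FF
synth_fast_copeton_0 = copeton_syllable_0.solve(z=optimal_z, method="fast")
plot.spectrogram_waveform(synth_fast_copeton_0, ff_on=True, save=False)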
Additionally, you can export and import the entire Syllable object for further use or analysis.
Saving Curves#
Export#
Exporting the syllable parameters:
synth_copeton_syllable_0.export_mg(export_curves=True)
Curves arrays saved at ../../assets/results/mg_params/synth-574179401-ZonotrichiaCapensis-0-curves.csv
Motor gesture parameters saved at ../../assets/results/mg_params/574179401-ZonotrichiaCapensis-0-mg.csv.
Import#
Importing the parameters from the CSV file:
proj_dirs = ProjDirs(audios="../../assets/audio", results="../../assets/results")
imported_synth_copeton_syllable_0 = proj_dirs.import_mg("574179401", no_syllable=0)
plot.spectrogram_waveform(imported_synth_copeton_syllable_0, ff_on=True)
Image save at ../../assets/results/figures/synth-574179401-ZonotrichiaCapensis-0-intro-down
This process generates a CSV file containing all the parameters of the Syllable; its file name ends with "mg". If the best method is selected, an additional file ending with "curves" is created during export, containing the curve arrays for time (\(t\)), air-sac pressure (\(\alpha(t)\)), and labial tension (\(\beta(t)\)). By default, both files are stored in the directory assets/results/mg_params/.
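Because the exported files are plain CSVs, they can also be inspected with standard tools. A minimal sketch with pandas; the file path is taken from the export message above, and the column names are assumptions:
import pandas as pd

# Load the exported motor gesture curves (path from the export message above)
curves = pd.read_csv(
    "../../assets/results/mg_params/synth-574179401-ZonotrichiaCapensis-0-curves.csv"
)
print(curves.head())  # e.g., columns for t, alpha(t), and beta(t)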
imported_synth_copeton_syllable_0.play()
Plots#
Comparing Samples#
plot.syllables(copeton_syllable_0, synth_copeton_syllable_0, ff_on=True)
Image save at 574179401-ZonotrichiaCapensis-0-intro-down-SoundAndSpectros.png
Motor Gesture#
plot.alpha_beta(synth_copeton_syllable_0)
Image save at synth-574179401-ZonotrichiaCapensis-0-intro-down-mg_params.png
Physical Variables#
plot.phsyical_variables(synth_copeton_syllable_0)
Image save at synth-574179401-ZonotrichiaCapensis-0-intro-down-PhysicalVariables.png
Score Variables#
plot.scores(copeton_syllable_0, synth_copeton_syllable_0, save=True)
Image save at 574179401-ZonotrichiaCapensis-0-intro-down-ScoringVariables.png
This tutorial introduced WaveSongs for bioacoustic analysis, covering setup, song processing, spectrogram visualization, and synthetic syllable generation. By following these steps, you can analyze bird vocalizations and optimize synthetic models for research or educational purposes.
Happy coding!